(57) ABSTRACT

A computerized method and apparatus for reducing the size of a
dictionary used in a text-to-speech synthesis system are provided.
In an initial phase, the method and apparatus determine if 
entries in the dictionary, each containing a grapheme string 
and a corresponding phoneme string, can be fully matched 
by using at least one rule set used to synthesize words to 
phonemic data. If the entry can be fully matched using rule 
processing alone, the entry is indicated to be deleted from 
the dictionary. In a second phase, the method and apparatus 
determine if the entry, considered as a root word entry, is 
required in the dictionary in order to support phoneme 
synthesis of other entries containing the root word entry, and 
if so, the root word entry is indicated to be saved in the 
dictionary. If the other entries containing the root word entry 
can have correct phonemic data generated from a combination
of the root word entry's phonemic data and phonemes
generated from rule set processing, then the other entries are 
indicated to be deleted from the dictionary. After all words 
have been processed by phase one and/or phase two, the 
entries indicated to be saved are aggregated to form a 
reduced dictionary. 

35 Claims, 4 Drawing Sheets 
COMPUTER APPARATUS FOR
TEXT-TO-SPEECH SYNTHESIZER 
DICTIONARY REDUCTION 

RELATED APPLICATIONS 

This application is a Continuation of application Ser. No.
09/212,874, filed Dec. 16, 1998, U.S. Pat. No. 638,968.
The entire teachings of the above applications are incorporated
herein by reference.

The below described work is related to the subject matter 
disclosed in the following patent and patent application of 
the same assignee as the present invention, the contents of 
which are incorporated herein by reference: 

Title: COMPUTER METHOD AND APPARATUS 
FOR TRANSLATING TEXT TO SOUND 

Inventors: Thomas Kopec and Ginger Chun-Che Lin
U.S. Pat. No. 6,076,060, Issued Jun. 13, 2000

Title: COMPUTER METHOD AND APPARATUS 
FOR GRAPHEME TO PHONEME RULE-SET 
GENERATION 

Inventors: Anthony J. Vitale, Ginger Chun-Che Lin and Thomas Kopec
Ser. No.: 09/179,153, Filed Oct. 16, 1998

BACKGROUND OF THE INVENTION 

Generally speaking, a "speech synthesizer" is a computer 
device or system for generating audible speech from written 
text. That is, a written form of a string or sequence of 
characters (e.g., a sentence) is provided as input, and the 
speech synthesizer generates the spoken equivalent or 
audible characterization of the input. The generated speech 
output is not merely a literal reading of each input character, 
but a language dependent, in-context verbalization of the 
input. If the input was the phone number (508) 691-1234 
given in response to a prior question of "What is your phone 
number?", the speech synthesizer does not produce the 
reading "parenthesis, five hundred eight, close parenthesis,
six hundred ninety-one . . . " Instead, the speech synthesizer
recognizes the context and supporting punctuation and produces
the spoken equivalent "five (pause) zero (pause) eight
(pause) six . . . " just as an English-speaking person normally
pronounces a phone number. 

Historically the first speech synthesizers were formed of 
a dictionary, engine and digital vocalizer. The dictionary 
served as a look-up table. That is, the dictionary cross 
referenced the text or visual form of a character string (e.g.,
word or other unit) and the phonetic pronunciation of the 
character string/word. In linguistic terms the visual form of 
a character string unit (e.g., word) is called a "grapheme" 
and the corresponding phonetic pronunciation is termed a 
"phoneme". The phonetic pronunciation or phoneme of 
character string units is indicated by symbols from a pre- 
determined set of phonetic symbols. 

The engine is the working or processing member that 
searches the dictionary for a character string unit (or com- 
bination thereof) matching the input text. In basic terms, the 
engine performs pattern matching between the sequence of 
characters in the input text and the sequence of characters in 
"words" (character string units) listed in the dictionary. 
Upon finding a match, the engine obtains from the dictionary 
entry (or combination of entries) of the matching word (or 
combination of words), the corresponding phoneme or combination
of phonemes. To that end, the purpose of the engine
is thought of as translating a grapheme (input text) to a 
corresponding phoneme (the corresponding symbols indi- 
cating pronunciation of the input text). 

Typically the engine employs a binary search through the
dictionary for the input text. The dictionary is loaded into the
computer processor physical memory space (RAM) along
with the speech synthesizer program. The memory footprint,
i.e., the physical memory space in RAM needed while
running the speech synthesizer program, thus must be large
enough to hold the dictionary. Because the dictionary portion
of today's speech synthesizers continues to grow in size, the
memory footprint is problematic due to the limited available
memory (RAM and ROM) in some/most applications.
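
The dictionary look-up described above can be sketched as follows. This is only an illustration, assuming the dictionary is held in memory as two parallel lists sorted by grapheme string; the names `GRAPHEMES`, `PHONEMES` and `lookup` are hypothetical and do not come from any actual synthesizer.

```python
from bisect import bisect_left

# Hypothetical in-memory dictionary: grapheme strings kept in sorted order,
# with the phoneme string for each one stored at the same index.
GRAPHEMES = ["aardvark", "abandon", "abase"]
PHONEMES = ["'ardvark", "xb'@ndxn", "xb'es"]


def lookup(word):
    """Binary-search the sorted grapheme list; return the phoneme string or None."""
    i = bisect_left(GRAPHEMES, word)
    if i < len(GRAPHEMES) and GRAPHEMES[i] == word:
        return PHONEMES[i]
    return None
```

Because the whole list must be resident for the search to be fast, every entry that can be removed shrinks the memory footprint directly.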

The digital vocalizer receives the phoneme data generated
by the engine. Based on the phoneme data together with
timing and stress data, the digital vocalizer generates sound
signals for "reading" or "speaking" the input text. Typically,
the digital vocalizer employs a sound and speaker system for
producing the audible characterization of the input text.

To improve on memory requirements of speech
synthesizers, another design was developed. In that design,
the dictionary is replaced by a rule set. Alternatively, the rule
set is used in combination with the dictionary instead of
completely substituting therefor. At any rate, the rule set is
a group of statements in the form

IF (condition)-then-(phonemic result)

Each such statement determines the phoneme for a grapheme that
matches the IF condition. Examples of rule-based speech
synthesizers are DECtalk by Digital Equipment Corporation of
Maynard, Mass. and True Voice by Centigram Communications
of San Jose, Calif. Though the use of rule sets reduces
the number of entries required in a dictionary for a speech
synthesizer system, the dictionaries remain relatively large
in size (i.e., number of entries) compared to other parts of the
system requiring memory. This is problematic because
dictionaries must be completely stored in memory during the
speech synthesis process to ensure fast and efficient look-up
of entries if needed.

These and other problems exist in speech synthesizer
technology. New solutions have been attempted but with
little success. As a result, highly accurate and/or memory
space efficient speech synthesizers are yet to come.

SUMMARY OF THE INVENTION

Dictionaries used by text-to-speech synthesis systems
may grow to become quite large. Dictionary size depends on
how many words or word portions in a particular language
are determined to be too complex, too difficult or too time
consuming to translate into phonemes by rule set processing
alone. Such words or word portions are candidates to be
included as entries in the dictionary. However, certain problems
are encountered when large dictionaries are used in
text-to-speech synthesis systems as mentioned above.

The invention recognizes the problems with prior art
text-to-speech synthesis systems that use dictionaries and
provides an apparatus to reduce the overall size of the
dictionaries used in such systems. Specifically, the invention
uses a two phase dictionary reduction process to eliminate
entries that are not required in the dictionary. In phase one,
any entries in the dictionary with respective phonemes that
can be fully generated by rules in a rule set are marked or
indicated to be deleted from the dictionary. In phase two, any
entries in the dictionary, called root word entries, that can
provide phonemes for the text-to-speech translation process
of larger (longer) entries are marked or indicated to be saved
in the dictionary, and the entries of longer character strings
that can be translated using the shorter root word entries in
conjunction with rules are indicated to be deleted from the
dictionary. After phase one and/or phase two are complete,
the invention aggregates the entries marked to be saved or
removes the entries marked to be deleted and the resulting
set of entries is stored as the reduced dictionary.

Phase one or phase two of the invention each may be 
performed independently, followed by the aggregation step. 
Alternatively, phase one may be followed by phase two and 
then by the aggregation process. 
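
The aggregation step can be illustrated with a minimal sketch. It assumes each entry carries a delete mark and a save mark, and that a save mark takes precedence over a delete mark when both are set; the `Entry` class and `aggregate` function are hypothetical names chosen for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    grapheme: str
    phoneme: str
    delete: bool = False  # may be set by phase one or phase two
    save: bool = False    # may be set by phase two for root word entries


def aggregate(entries):
    """Build the reduced dictionary: keep entries marked saved or never marked deleted."""
    return [e for e in entries if e.save or not e.delete]
```

The same function serves whether phase one, phase two, or both have run, since it only inspects the marks left behind.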

In order for embodiments of phase one to determine if the 
phoneme of an entry in the dictionary can be fully generated 
(and hence the dictionary entry can be fully matched) by 
using the rule set, the invention apparatus generates a
rule-based phoneme string for the grapheme string of the
subject entry and then determines if the rule-based phoneme
string matches the corresponding phoneme string of the
entry. If there is a match, the subject entry is indicated to be
deleted from the dictionary, thus reducing overall dictionary
size. Since rules alone can produce the required phoneme
string for the subject entry, the invention recognizes that
there is no need for the entry to remain in the dictionary.

Embodiments of phase one may also check if the graph- 
eme string of a dictionary entry is a homograph. If so, the 
preferred embodiment skips to the next entry in the dictio- 
nary for processing. A homograph is a word that can be 
pronounced two different ways but which has one spelling, 
such as "abstract", "wind", and "record". Due to multiple 
pronunciations, homograph dictionary entries are skipped 
since they may have more than one associated phoneme 
string. During text-to-speech processing, the correct pho- 
neme string is selected from a homograph dictionary entry 
based on the context of surrounding language in the text 
being translated. 
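
A minimal sketch of phase one as summarized above follows. Real rule set processing is replaced here by a fixed look-up table, which is purely an assumption for illustration; `phase_one`, `toy_rules`, and the rule outputs in `TOY_RULE_OUTPUT` are hypothetical.

```python
def phase_one(entries, rules_to_phonemes, homographs):
    """Mark for deletion each entry whose phoneme string the rules reproduce exactly."""
    for entry in entries:
        if entry["grapheme"] in homographs:
            continue  # homograph entries are skipped, not marked
        if rules_to_phonemes(entry["grapheme"]) == entry["phoneme"]:
            entry["delete"] = True
    return entries


# Toy stand-in for rule processing: a fixed table instead of real rule sets.
TOY_RULE_OUTPUT = {"abase": "xb'es", "aardvark": "ardv'ark", "wind": "w'Ind"}


def toy_rules(grapheme):
    return TOY_RULE_OUTPUT.get(grapheme, "")


entries = [
    {"grapheme": "abase", "phoneme": "xb'es", "delete": False},
    {"grapheme": "aardvark", "phoneme": "'ardvark", "delete": False},
    {"grapheme": "wind", "phoneme": "w'Ind", "delete": False},
]
phase_one(entries, toy_rules, homographs={"wind"})
```

In this toy run only "abase" is marked: the rules reproduce its phoneme string, they fail on "aardvark", and "wind" is skipped as a homograph even though the toy rules happen to match it.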

Embodiments of phase two determine if dictionary 
entries, referred to as root word entries, are required in the 
dictionary. This is accomplished by the invention combining 
grapheme and phoneme strings of the root word entry from 
the dictionary with respective grapheme and phoneme portions
of an affix rule of an affix rule set of the speech
synthesis system. This step of combining forms a grapheme
combination and phoneme combination pair. Phase two then 
determines if the grapheme combination and phoneme com- 
bination pair exists as another matching entry in the 
dictionary, and if so, indicates the root word entry to be 
saved in the dictionary. The matching entry is thus marked 
for removal/deletion. Thus, phase two saves root words in 
the dictionary that can be used to assist in the translation of 
another longer word (the matching entry) in conjunction 
with rule-based processing, and removes the matching 
entries from the dictionary which can be correctly translated 
with a combination of rule processing and root word pho- 
nemes. 

To create the grapheme combination and phoneme com- 
bination pair, embodiments of phase two select and process 
each root word entry in the dictionary. Specifically for each 
root word entry, the invention combines the grapheme string 
of the root word entry with the grapheme portion of the affix 
rule to form a grapheme combination, and combines the 
phoneme string of the root word entry with the phoneme 
portion of the affix rule to form a phoneme combination. 
Then phase two determines if the grapheme combination 
exists as a matching grapheme string in an entry in the 
dictionary. If so, the invention obtains the corresponding 
phoneme string as a matching phoneme string for the 
matching entry. Then, phase two determines if the phoneme
combination matches the matching phoneme string, and if
so, indicates the root word entry to be saved in the dictionary.
Thus, the root words that are saved in the dictionary are
root words that can be used in the translation of the other
matching entries. Phase two also determines if the matching
entry has been indicated to be saved in the dictionary. If not,
the invention indicates the matching entry to be deleted from
the dictionary. As such, phase two reduces the dictionary
size by determining which entries rely on phonemes of root
words, and saves the root words and deletes entries that can
be matched by the root words and rule processing.
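
The phase two steps above can be sketched as follows. This is a simplified illustration that handles suffix rules only (a prefix rule would prepend rather than append); the function name, the two affix rules, and the toy phoneme strings are assumptions made for the example.

```python
def phase_two(entries, affix_rules):
    """entries maps grapheme -> {"phoneme", "save", "delete"};
    affix_rules is a list of (grapheme_suffix, phoneme_suffix) pairs."""
    for root, root_entry in entries.items():
        for g_affix, p_affix in affix_rules:
            g_combo = root + g_affix                    # grapheme combination
            p_combo = root_entry["phoneme"] + p_affix   # phoneme combination
            match = entries.get(g_combo)
            if match is None or match["phoneme"] != p_combo:
                continue
            root_entry["save"] = True                   # root word is needed
            if not match["save"]:
                match["delete"] = True                  # longer entry is redundant
    return entries


def _entry(phoneme):
    return {"phoneme": phoneme, "save": False, "delete": False}


entries = {
    "long": _entry("l'cG"),
    "longing": _entry("l'cG|G"),
    "longingly": _entry("l'cG|Gli"),
}
phase_two(entries, [("ing", "|G"), ("ly", "li")])
```

In this run "long" is saved and "longingly" is marked for deletion. "longing" ends up with both marks, because it is first matched as a longer entry of "long" and later found to be the root of "longingly"; how such an entry is finally treated depends on the aggregation step.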

By using either phase one or phase two alone, or phase
one followed by phase two, the invention reduces the
number of entries in a dictionary. To that end, the invention
computer apparatus forms a reduced (i.e., smaller in size)
dictionary. The reduced dictionary is adaptable to text-to-
speech synthesis applications requiring minimal storage
space, entry search time, and dictionary load time.

BRIEF DESCRIPTION OF THE DRAWINGS 

The foregoing and other objects, features and advantages
of the invention will be apparent from the following more
particular description of preferred embodiments of the
invention, as illustrated in the accompanying drawings in
which like reference characters refer to the same parts
throughout different views. The drawings are not meant to
limit the invention to particular mechanisms for carrying out
the invention in practice, but rather, are illustrative of certain
ways of performing the invention. Others will be readily
apparent to those skilled in the art.

FIG. 1 schematically illustrates the operation of a text- 
to-speech synthesis system using rule sets and a dictionary 
to translate words in text to electronically generated speech. 

FIG. 2 is a flow diagram illustrating the two phases of the 
dictionary reduction process of the invention. 

FIG. 3 is a flow chart illustrating the steps involved in 
phase one of the dictionary reduction process of FIG. 2 
according to the invention.

FIG. 4 is a flow chart illustrating the steps involved in 
phase two of the dictionary reduction process of FIG. 2 
according to the invention. 

DETAILED DESCRIPTION OF PREFERRED 
EMBODIMENTS 

Generally, the present invention provides an apparatus for
reducing the size of a dictionary used in a text-to-speech
synthesis system. FIG. 1 illustrates the general operation of
a typical computerized text-to-speech synthesis system 100
that uses a dictionary 104 that can be reduced in size by this
invention. In operation, the text-to-speech synthesizer 101
accepts written text 102 containing words, phrases, names,
symbols and so forth as input. Speech synthesizer 101 then
employs rule sets 103a through 103c in conjunction with
dictionary 104 to translate the input text 102 into audible
electronically generated speech 107. The generated speech
is output through a speaker device 106 for example. The
present invention is an apparatus for eliminating unnecessary
entries in dictionary 104 to reduce its overall size. A
dictionary reduced in size by this invention requires less
storage space on disk and in memory when used during the
text-to-speech translation process performed by text-to-
speech synthesizer 101. Also, since there are fewer entries in
dictionary 104 after the reduction process of the present
invention, the processing time required to load and to search
the dictionary 104 may be reduced as well.

In order to better understand the details of the dictionary
reduction processes performed by this invention, a brief
explanation of dictionary entries and rule set structure and
processing will be presented next. Table 1 below illustrates
a small example of entries from a dictionary, such as those
that might be found in dictionary 104. The entries in Table
1 are examples and are not limitations on the present
invention or speech synthesis system 100.

TABLE 1

EXAMPLE PORTION OF DICTIONARY

                        Grapheme String    Phoneme String

Dictionary Entry 1      aardvark           'ardvark
Dictionary Entry 2      aaron              '@r|n
Dictionary Entry 3      aback              xb'@k
Dictionary Entry 4      abacus             '@bxkxs
Dictionary Entry 5      abalone            '@bxPoni
Dictionary Entry 6      abandon            xb'@ndxn
Dictionary Entry 7      abase              xb'es
Dictionary Entry 8      long               l'cG
Dictionary Entry 9      longing            l'cG|G
Dictionary Entry 10     longingly          l'cG|Gli


In Table 1, each dictionary entry 1 through 10 contains (i) 
a grapheme (i.e., character) string portion (Column 1) com- 
prising one or more graphemes, and (ii) a phoneme string 
portion (Column 2) comprising one or more phonemes. 
Generally, a grapheme string corresponds to a word in the 
dictionary, but the term "word" as used herein does not 
necessarily mean the formal linguistic unit in the language 
of the dictionary. Rather, some words in the dictionary can
be portions or segments of longer, more formally or commonly
known words. A single grapheme is any character or symbol
in the entire alphabet of the language of the dictionary, such
as English. A grapheme may be a letter "A" through "Z" or
"a" through "z", a number such as "0" through "9", or another
character or symbol such as "?", "!", "@", and so forth. A
grapheme string is one or more graphemes appended 
together. 

A phoneme is one or more character symbols used to 
represent a single phonetic utterance or sound that may be 
made when speaking the language of the dictionary. The 
entire set of phonemes for a language represents all possible 
utterances that may be combined to pronounce words in that 
language. A phoneme string is a series of phonemes 
appended together which represent the phonetic pronuncia- 
tion of one or more corresponding graphemes (i.e., a graph- 
eme string). As such, a correctly assembled phoneme string 
represents the phonetic pronunciation for the corresponding 
grapheme string in a given dictionary entry. 

For example, in Table 1, dictionary entry number nine has 
as a subject grapheme string, the word "longing", and 
indicates a corresponding phoneme string of "l'cG|G". There
are sub-strings (i.e., respective graphemes) in the word 
"longing" that correspond to each phoneme in this phoneme 
string. 

In Table 1, example dictionary entries 1 through 10
resemble dictionary entries of words such as those found in 
a normal English dictionary. A dictionary that can be 
reduced by this invention may however contain other infor- 
mation as well, such as word definitions, but this invention 
is not concerned with this other information. Dictionaries 
that can be reduced in size by the invention can be created 
specifically for text-to-speech synthesis systems, or
alternatively, the invention may reduce off-the-shelf
commercially available dictionaries, such as those supplied on
CD-ROMs for other types of application programs besides
speech synthesis. The dictionary to be reduced can be for 
any language, so long as each entry contains a grapheme 
string and a corresponding phoneme string. 

A dictionary not specifically designed for use by a text- 
to-speech synthesis system is usually very large in size, and 
contains entries for most words in a language. Dictionaries 
in the prior art that are designed specifically for text-to- 
speech synthesis systems are usually larger in size than what 
is actually needed to perform the text-to-speech synthesis
process. The invention is advantageous since it reduces both 
these and various other types of dictionaries. 

As noted previously, large dictionaries are difficult to 
store entirely in memory during text-to-speech processing, 
since they can be many megabytes in size. Also, performing 
text-to-speech translations by looking up each word in a 
large dictionary is slow compared to rule-based translation
processing, which will be discussed shortly. Since text-to- 
speech synthesis is typically a real-time process, extremely 
fast processors and large amounts of memory would be 
needed to perform translations using a dictionary alone. 

Accordingly, rule sets (such as 103a, b, and c in FIG. 1) 
are frequently used in text-to-speech synthesis systems 100 
to quickly translate graphemes of words into phonemes 
which may then be converted to audible sounds 107. 
Grapheme-to-phoneme rules, contained in rule sets 103, 
provide a concise way to analyze a character string in the 
language and produce the required phonemic data for sound 
synthesis. Furthermore, rules in a rule set 103 may be 
generic in that they may convert character strings that are 
generally not considered to be words worthy of existing in 
the dictionary 104. 

Each rule set 103a through 103c contains a number of 
rules in the form: 

IF (condition) -then- (phonemic result).

Each rule determines the proper corresponding phoneme(s) for a grapheme
string that matches the IF condition. The previously noted 
rule-based text-to-speech synthesizer called DECtalk from 
Digital Equipment Corporation of Maynard, Mass. uses rule 
sets 103 in combination with a dictionary 104 to translate 
text to speech. 

During rule processing, each rule of the rule set 103 is 
considered with respect to the input text 102. Rule-based 
processing typically proceeds one word or unit of text at a 
time from the beginning to the end of the input text. Each 
word or input text unit is then processed by selecting a 
number of graphemes (i.e. characters) from either the 
beginning, middle, or end of the input text 102. The graph- 
emes selected depend upon the rule set being used. If a rule 
condition ("IF-Condition" part of the rule) matches any
portion of the input text 102, then the text-to-speech syn- 
thesizer 101 determines that the rule applies. As such, the 
synthesizer stores the corresponding phoneme data (i.e., the 
phonemic result) from the rule in a working buffer. The 
synthesizer 101 similarly processes each succeeding rule in 
the rule set 103 against the remaining input text 102 (i.e., 
remainder parts thereof) for which phoneme data is needed. 
After processing all of the text 102 via rules in the rule sets 
103, the working buffer holds the phoneme data correspond- 
ing to the text which may then be converted to audible 
speech. For more complete details on the translation of text 
to sound via rule processing, see co-pending patent appli- 
cation Ser. No. 09/071,441, filed May 1, 1998, entitled 
"Computer Method and Apparatus for Translating Text To 
Sound", which is assigned to the assignee of the present 
application and which is incorporated herein by reference in 
its entirety. 
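
The rule processing loop described above can be sketched for the suffix case. This is a simplified illustration, not the processing of the referenced application: it assumes each rule is a (grapheme, phoneme) pair and that rules are tried in list order against the end of the remaining text. The single rule shown, "ful" to "fL", follows the suffix example given in this description; the function name and the list layout are hypothetical.

```python
# Each rule pairs an IF condition (a suffix grapheme string) with its phonemic result.
SUFFIX_RULES = [("ful", "fL")]


def apply_suffix_rules(word, rules):
    """Strip matching suffixes from the end of word, collecting phonemes in a buffer.

    Returns (remaining_text, phoneme_string_for_the_stripped_suffixes).
    """
    buffer = []          # working buffer for phoneme data
    remainder = word
    matched = True
    while matched and remainder:
        matched = False
        for grapheme, phoneme in rules:
            if remainder.endswith(grapheme):
                buffer.insert(0, phoneme)                 # suffixes are found right to left
                remainder = remainder[: -len(grapheme)]   # text still needing phonemes
                matched = True
                break
    return remainder, "".join(buffer)
```

The remaining text would then be handed to the other rule sets or to the dictionary, which is where a retained root word entry becomes useful.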
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Table 2 below illustrates ten example rules from a specific 150 of the reduction process shown in FIG. 2, and may be 
type of rule set, called a suffix rule set (e.g. 103c in FIG. 1) performed independently of phase two which is represented 
used for English text strings. by step 151, Accordingly, the reduction process of the 

invention may begin at either of the "Begin Reduction'* 
TABLE 2 5 indicators 154 or 155 in FIG. 2. 

— ^ — — — Phase one (Step 150) of the invention is based on the 
example portion of a sufhx rule set observation Li an unreduced dictionary 104 may be 

Phonemic Data reduced in size by eliminating (i.e., deleting or removing) 

Grapheme Portion (Phoneme Portion) any entries in the dictionary 104 that can be fully matched 

10 by the rules in rule sets 103«-c in conjunction with rule set 
processing. During text-to-speech synthesis processing, 
entries in the dictionary 104 that occur in input text 102, and 
that may be matched entirely by rules, need not remain in the 
dictionary 104. As such, phase one (Step 150) determines for 
35 each entry in the dictionary 104, if the entry can be fully 
matched (i.e. can have corresponding phonemes generated) 
by using the rules of the rule sets 103a-c, and if so, marks 
or indicates those entries to be deleted from the dictionary 
104.That is, phase one of the dictionary reduction process 
Text-to-speech synthesis systems 100, such as that shown 20 marks for elimination any entries in the dictionary 104 that 
in FIG. 1, may use multiple rule sets to obtain phonemic data can be properly matched or translated to phonemes by the 
(i.e., phonemes) for different parts of a given input text/ m ^ ^ 1**3. 

character string 102 (e.g. individual words). There may be After phase one is complete, phase two (Step 151) is 
rule sets for matching (i) suffixes, which are one or more typically performed next. However, processing may alter- 
graphemes obtained from the end of a character string, (ii) 25 natively bypass phase two (Step 151) by following optional 
prefixes, which are one or more graphemes selected from the processing path 153 to step 152, where the reduced dictio- 
beginning of a character string, and (iii) infixes, which are nary 104nfl is created. 

one or more graphemes selected from somewhere in the Phase two (step 151) is based on the observation that 
middle of the subject text string, between the beginning and ^ some entries in the dictionary 104, called root word entries, 
the end. Suffix and prefix rule sets are called "Affix" rule m ay provide phonemic data for the text-to-speech transla- 
sets, since they match grapheme portions (i.e., strings) tion process of longer words/text strings. As such, these root 
obtained starting from either the beginning or end of a word. worc j entries should not be removed from the dictionary 104 
In FIG. 1, rule set 103-a corresponds to a prefix rule set, rule to reduce its size, since the synthesis of longer words in text 
set 103-6 corresponds to an infix rule set, and rule set 103-c ■ 1Q2 that contain the root words (i.e., are dependent on these 
corresponds to a suffix rule set, for example. ro ot word entries) can be performed using the root word 

The example suffix rules in Table 2 map a respective entries. Furthermore, if longer word entries in dictionary 104 
suffix-like (ending) grapheme portion to corresponding pho- may be translated to phonemes using root word entries in 
nemic data or phoneme portion (i.e., one or more conjunction with rule processing, then the longer word 
phonemes). For example, Rule 9 is used to convert an ending 4Q entries can be removed from the dictionary 104 to even 
text string (i.e., the suffix grapheme string) "ful" to the further reduce its size. Step 151 thus determines if a root 
phoneme string "fL". The suffix rules shown in Table 2 are word entry in the dictionary 104 can be used to support the 
given for example only. A full suffix rule set may contain text-to-speech synthesis of other dictionary entries. If so, 
many more entries than those shown in Table 2. While not then that root word entry is indicated or marked to be saved 
illustrated in a table, rules in a prefix rule set are similar in 45 in dictionary 104. Step 151 also determines, based on that 
nature to the rules in the suffix rule set above, but match root word entry, if longer word entries (i) have not been 
prefix grapheme portions of character strings to prefix previously indicated to be saved in the dictionary 104, and 
phonemic data. Likewise, an infix rule set contains rules for (ii) can be translated via phonemes provided by one or more 
matching infix grapheme portions, obtained from the middle root word entries and rule processing (i.e., the longer word 
of text strings, to phonemic data as well. 5Q entries contain the root word and some other characters). If 

Interestingly, rule sets themselves may be generated by an these two conditions are met, then the longer word entry is 
analysis of dictionary entries containing a grapheme string indicated to be deleted from the dictionary 104. 
and corresponding phoneme strings. A rule set generation As noted previously, phase one (Step 150) may be fol- 
process is described as a separate invention in co-pending lowed by phase two (Step 151. In such cases, phase two can 
U.S. patent application Ser. No. (Unknown) filed October 55 indicate a word to be saved that was previously indicated to 
26, 1998, entitled "Automatic Grapheme-to-Phoneme Rule- be deleted during phase one processing. That is, if phase one 
Set Generation", which is assigned to the assignee of this determines a word (i.e., subject character string) can be 
invention and is hereby incorporated by reference in its matched by rules alone and thus indicates the corresponding 
entirety. Typically, a dictionary having many entries, which dictionary entry is not needed and should be deleted, phase 
has not yet been reduced by the teachings of this invention, 60 mav subsequently reverse this decision and indicate that 
is used for rule set generation in the referenced application. the dictionary entry containing the subject word/character 
After the rule sets have been generated from an analysis of string, which is determined to be a root word of other longer 
the dictionary, the dictionary may then be reduced by phase words, should be saved. 

one and/or phase two of the present invention. After either phase one or phase two or both phase one and 

FIG. 2 illustrates the two phases used in the present 65 two have been completed, step 152 is performed. Step 152 
invention to reduce the size of a dictionary 104 in a creates a reduced dictionary 104-a based on the entries in 
text-to-speech synthesis system 100. Phase one includes step dictionary 104 that have been indicated to be saved and/or 
deleted by phase one and/or phase two processing. Step 152 may be performed in a variety of ways, with the objective of creating reduced dictionary 104-a which is smaller in size (i.e., memory and storage requirements) than initial dictionary 104. Step 152 aggregates entries from the original unreduced dictionary 104 that have been indicated to be saved, and eliminates entries indicated to be deleted.

FIG. 3 illustrates the processing steps for a preferred embodiment of phase one (Step 150 in FIG. 2). The processing of FIG. 3 reduces the number of entries in dictionary 104 to produce reduced dictionary 104-a. To accomplish this, first, word linked list 207 is created by step 200. The word linked list 207 is a series of data structures 208 that each contain a single entry from dictionary file 104. Each data structure 208 includes (a) an indication of the respective entry grapheme string 208-a, (b) an indication of the corresponding phoneme string 208-b, (c) a delete flag 208-c that may be set or un-set as needed, and (d) a save flag 208-d that indicates root words that must be saved. The delete flag 208-c and the save flag 208-d for each data structure 208 are initially set to false for each word entry. The first data structure 208 in word linked list 207 corresponds to the first entry from dictionary 104, the second data structure 208 in word linked list 207 corresponds to the second dictionary entry, and so forth. From the example dictionary entries in Table 1 above, the entry for "aardvark" and its phonemic data "'ardvark" is stored as the first data structure 208 in the word linked list 207, followed by another data structure 208 for the dictionary entry for "aaron", and so forth. In a preferred embodiment, each entry in dictionary file 104 is read into memory and stored in the word linked list 207 as a separate data structure 208.

Steps 201 through 206 are then performed for each data structure 208 in the word linked list 207. Beginning with the first word linked list data structure 208, step 201 attempts to match any one of the affix rules from affix rule sets 103-a and 103-c to the grapheme string 208-a of the subject data structure 208. For instance, step 201 attempts to match suffix rules to the end, and prefix rules to the beginning, of grapheme string 208-a. If any affix rule matches, processing skips to the next word linked list data structure 208 to obtain the next grapheme string 208-a. Thus, any dictionary 104 entry words (i.e., grapheme string 208-a of each data structure 208) that can initially be matched to an affix (prefix or suffix) rule are skipped by phase one. The reason for this is that words having a prefix and/or a suffix are typically complex words which include one or more root words. Words containing root words and a prefix and/or suffix that can be matched with affix rules are dealt with during phase two processing.

If no affix rules match any beginning or ending graphemes in grapheme string 208-a, step 202 then uses rules in infix rule file 103 to generate phonemes based on an analysis of the subject grapheme string 208-a. That is, step 202 takes the grapheme string (i.e., the dictionary entry character string/word) for the subject data structure 208 currently being processed by steps 201 through 206 and attempts to parse the grapheme string 208-a using only grapheme-to-phoneme rules from infix rule set 103-b. This parsing process (Step 202) ultimately creates a rule-based phoneme string, just as if the grapheme string were input text being translated for text-to-speech synthesis using infix rules. As noted previously, rule processing is described in detail in the co-pending U.S. patent application, entitled "Computer Method and Apparatus for Translating Text to Sound." As an example of step 202, take the first entry in the dictionary example from Table 1. Assume that no affix rules matched the "aardvark" grapheme string in step 201. In step 202, "aardvark" would be parsed by infix rules in the infix rule set 103-b to produce an infix rule-based phoneme string for this word, such as "'ardvark". This resultant rule-based phoneme string may or may not be equivalent to the corresponding phoneme string 208-b in the current data structure 208/dictionary 104 entry.

After a rule-based phoneme string is generated via infix rule processing in step 202, step 203 normalizes the stress notation marks in the generated rule-based phoneme string. The exact normalization mechanism depends on the characteristics and structure of the rule sets and the dictionary; in the preferred embodiment, the stress mark for a syllable always precedes a vowel phoneme in the syllable, and the rules may place the stress marks further to the left; thus, the preferred embodiment normalizes stress marks by shifting them to the right until they reach a vowel phoneme. For example, if the rule-based phoneme string for "abase" were "x'bes", the "'" stress mark would be shifted to the right by one character resulting in the phoneme string "xb'es". Stress normalization corrects for different, but equivalent, placement of the stress mark relative to the syllable boundaries of the word, which can occur due to different dialects of a language.

Next, step 204 compares the normalized rule-based phoneme string (originally generated in step 202) with the phoneme string portion 208-b in the subject data structure 208 for the current dictionary entry in the word linked list 207. The comparison is performed to determine if the rule-based phoneme string produced from the rule processing of step 202 matches the phoneme string portion 208-b of the current data structure 208 dictionary entry. A "match" or "no match" decision is performed in step 205. If the two phoneme strings do not match, then the rule-based phoneme string from step 202 is different than the actual phoneme string 208-b for the subject data structure 208 entry obtained from the dictionary 104. Accordingly, the entry remains in the dictionary 104 and processing proceeds to step 201 to process the next dictionary entry data structure 208. That is, steps 204 and 205 determine if the rule generated phoneme string for the subject data structure 208 and its corresponding phoneme string 208-b from the corresponding dictionary entry are the same or not. If they are not the same, then infix rules alone cannot be used to generate a correct phoneme string for this dictionary entry, and the entry should remain in the dictionary.

However, if step 205 determines that the rule-based phoneme string and the actual phoneme string 208-b for the current data structure 208 (i.e., the phoneme string obtained from the corresponding dictionary entry) are the same (i.e., they match each other), then step 206 sets the delete flag 208-c in the data structure 208. This indicates that the corresponding dictionary entry is to be deleted. In this instance, the entry need not remain in the dictionary 104, since rule-based processing alone can generate rule-based phonemic data identical or equivalent to that found in the phoneme string portion of the entry in dictionary 104. That is, since the subject grapheme can be correctly converted to phonemes by infix rules, there is no need to maintain the respective entry in the dictionary 104.

After step 205 and/or 206 are complete, processing returns to obtain the next data structure 208 for the next entry in word linked list 207. After all entries have been processed by steps 201 through 206, phase one is complete. Certain data structures 208 in word linked list 207 will have their delete flags 208-c set, indicating that the corresponding entries are to be deleted from dictionary 104. At this point,
if only phase one of dictionary reduction is to be performed, processing proceeds to step 152 in FIG. 2 via path 153, in order to process the word linked list 207 into a reduced dictionary 104-a. Step 152, as noted above, selects those entries not indicated to be deleted for storage in reduced dictionary 104-a.

If phase two is performed after phase one, after all entries have been processed by steps 201 through 206 in FIG. 3, processing proceeds to step 300 in FIG. 4 to begin the steps of phase two. If phase two is performed without first performing phase one (150 of FIG. 3) of the dictionary reduction process, processing begins in phase two by creating the same word linked list 207 containing the same entries in data structures 208 as described above with respect to step 200 of phase one.

In either case, phase two consists of two nested loops of processing which are illustrated by the dotted lines labeled 151 and 305 and titled "for each word" and "for each affix", respectively, in FIG. 4. The outer loop 151 of phase two processing begins by selecting the first data structure 208 from word linked list 207, and proceeds to step 300. Each data structure 208 in word linked list 207 is processed by steps 300 through 303, and the data structure 208 that is currently being processed is called the root word entry. After each root word entry is selected, steps 300 through 303 are then performed for this root word entry for every affix in affix table 304.

Affix table 304 is a data structure, such as a table or linked list, which has entries that each hold a single grapheme string portion and phonemic data portion for a single respective affix rule from the affix (i.e., prefix and suffix) rule sets 103-a and 103-c. Just as the word linked list 207 has dictionary entry data structures 208 each containing a grapheme 208-a and phoneme 208-b pair, each affix table entry corresponds to an affix rule and holds the rule's grapheme string and phoneme string portions. The affix table 304 may be created before phase two processing has started, or step 300 may create the affix table 304 before processing any data structures 208 from word linked list 207. As an example, affix table 304 may appear just as the rule set in Table 2, except that the affix table 304 contains an affix entry for all rules in both the suffix and prefix rule sets 103-a and 103-c (i.e., the affix rule sets). The affix table 304 is created to provide access to affix rule information in computer memory in order to increase the speed of phase two processing. In an alternative embodiment, step 300 may directly access affix rule sets 103-a and 103-c instead of the affix table 304, with the same objective. The affix entry that is processed at any point in time is referred to herein as the current affix entry.

The objective of phase two is to determine if the current root word entry 208 can provide proper phonemes 208-b for text-to-speech synthesis of a longer dictionary entry that contains the root word entry's grapheme string 208-a as part of its grapheme string. To perform this processing, step 300 in FIG. 4 creates combinations of the grapheme and phoneme strings of a root word entry data structure 208 from the word linked list 207 (i.e., the dictionary) with respective grapheme and phoneme portions of affix entries (i.e., rules) from the affix table 304 (i.e., the affix rule sets). More specifically, step 300 appends the grapheme portion from the current affix entry to the respective end or beginning of the grapheme string 208-a of the current root word entry being processed to create a grapheme combination. If the current affix entry is a prefix rule, the grapheme portion for this prefix rule is appended to the beginning of the root word entry's grapheme string 208-a. If the current affix entry is a suffix rule, the grapheme portion for this suffix rule is appended to the end of the root word entry's grapheme string 208-a. Step 300 also appends the phoneme portion from the current affix entry to the end or beginning of the phoneme string portion 208-b of the current root word entry data structure 208 being processed, to create a phoneme combination. If the current affix entry is a prefix rule, the phoneme portion for this prefix rule is appended to the beginning of the root word entry's phoneme string 208-b. If the current affix entry is a suffix rule, the phoneme portion for this suffix rule is appended to the end of the root word entry's phoneme string 208-b. In this manner, step 300 creates a grapheme combination and phoneme combination pair.

As an example of step 300, suppose the current root word entry data structure 208 corresponds to Dictionary Entry 8 from Table 1, which has "long" as its grapheme string 208-a and "l'cG" as its phoneme string 208-b. Also suppose the current affix entry from affix table 304 corresponds to Suffix Rule 2 in Table 2, which has "ing" as a grapheme portion and "|G" as a phoneme portion. Since the affix entry is for a suffix rule, step 300 combines the dictionary entry grapheme string "long" (i.e., the root word) and the rule grapheme portion "ing" to create the grapheme combination "longing". Step 300 also combines the dictionary entry's phoneme string 208-b "l'cG" with the phoneme portion of the affix entry (i.e., Suffix Rule 2) "|G", to create the phoneme combination "l'cG|G". The grapheme combination and phoneme combination pair thus appears as "longing l'cG|G".

Step 301 then compares this grapheme combination and phoneme combination pair with the grapheme string 208-a and phoneme string 208-b pair in every other dictionary entry stored in each data structure 208 in word linked list 207. Step 302 then determines if any of the comparisons match each other. If steps 301 and 302 determine that another dictionary entry exists in word linked list 207 that has the same grapheme string 208-a and phoneme string 208-b, this other dictionary entry's data structure 208 is called a matching word entry. That is, steps 301 and 302 determine if the grapheme combination and phoneme combination pair created in step 300 exists as a dictionary entry elsewhere in the dictionary 104.

If a match occurs in step 302, it has been determined that the combination of graphemes and phonemes from a root word along with graphemes and phonemes from an affix rule can produce the same grapheme and phoneme combination as another matching entry in the dictionary 104. Accordingly, step 303 indicates the current data structure 208 for that root word entry to be saved by setting the save flag 208-d to true. Step 303 then sets the delete flag 208-c in the matching word entry to true. That is, phase two can determine that a root word entry previously indicated to be deleted by the delete flag 208-c should actually be saved in the dictionary 104 by marking the save flag 208-d to true. If saved, phase two has, at this point, also determined that the root word entry in the dictionary 104 can be used along with the rules to translate the matching word entry, and thus the matching word entry is not needed in the dictionary 104. Accordingly, step 303 sets the delete flag 208-c to true for the matching word entry (i.e., data structure 208 that matched the grapheme combination and phoneme combination pair) to indicate that the matching word entry is to be deleted.

After steps 302 and/or 303, processing returns to step 300 where the next entry in affix table 304 is applied to the current root word data structure 208 via steps 300 through 303. When no more affix table entries are available, the next data structure 208 from word linked list 207 is selected as the current root word entry. When all dictionary entries
stored in word linked list 207 have been processed with all of the affix rules in affix table 304, the processing of phase two is complete. Processing then proceeds to step 152 in FIG. 2 (described above) in order to create the reduced dictionary 104-a from the word linked list 207. Any data structure 208 in word linked list 207 that is indicated as having a save flag 208-d marked true, or a delete flag 208-c marked false, is saved in the reduced dictionary 104-a. Thus, a save flag 208-d marked true overrides a delete flag 208-c marked true. Therefore, any word entry data structures 208 having a save flag 208-d equal to true will be saved in the reduced dictionary, regardless of what delete flag 208-c indicates. In this manner, phase two considerably reduces the size of dictionary 104.
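
The selection rule applied by step 152, in which a true save flag overrides a true delete flag, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation of the preferred embodiment: the `Entry` class, its field names, and `build_reduced_dictionary` are hypothetical stand-ins for data structure 208 and step 152, and the phoneme string given for "aaron" is a placeholder rather than real phonemic data.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical stand-in for one data structure 208."""
    grapheme: str         # grapheme string 208-a
    phoneme: str          # phoneme string 208-b
    delete: bool = False  # delete flag 208-c
    save: bool = False    # save flag 208-d


def build_reduced_dictionary(word_list):
    """Step 152 sketch: keep an entry if save is true or delete is false."""
    return [(e.grapheme, e.phoneme) for e in word_list if e.save or not e.delete]


word_list = [
    Entry("long", "l'cG", delete=True, save=True),  # save overrides delete
    Entry("longing", "l'cG|G", delete=True),        # marked deleted, so dropped
    Entry("aaron", "<placeholder>"),                # neither flag set, so kept
]
reduced = build_reduced_dictionary(word_list)
```

Here "long" survives even though phase one marked it for deletion, because phase two found that it supports "longing".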

In a preferred embodiment of the invention, step 202 in phase one (FIG. 3) only uses infix rule set 103-b to generate the rule-based phoneme string from the grapheme string 208-a of the current data structure 208/dictionary entry. This is because infix rule set 103-b contains a set of rules that match individual graphemes (i.e., letters) to individual phonemes, for the entire alphabet of the language. That is, in infix rule set 103-b, there are separate rules for "a", "b", "c", and so forth, which match each of these letters to a corresponding phoneme. By using infix rule set 103-b, step 202 is certain to always be able to produce at least one complete rule-based phoneme string from the subject data structure grapheme string 208-a, even if step 202 must match graphemes to phonemes letter by letter. In an alternative embodiment, step 202 can use prefix, infix, and suffix rule sets for rule processing to generate a rule-based phoneme string.
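
As a rough illustration of phase one (steps 201 through 206), the following Python sketch parses a grapheme string with a toy infix rule table, normalizes the stress mark, and sets the delete flag on a match. Everything named here is an assumption made for the example: the `INFIX_RULES` table, the `VOWELS` set, the affix lists, the greedy longest-match parsing strategy, and the placeholder phoneme string for "aaron" are hypothetical and are not taken from the rule sets of the preferred embodiment.

```python
from dataclasses import dataclass

# Hypothetical toy rules; a real infix rule set 103-b covers the whole alphabet.
INFIX_RULES = {"aa": "'a", "a": "a", "d": "d", "k": "k",
               "n": "n", "o": "o", "r": "r", "v": "v"}
VOWELS = set("aeioux")  # assumed vowel phoneme symbols
STRESS = "'"


@dataclass
class Entry:
    """Hypothetical stand-in for one data structure 208."""
    grapheme: str         # 208-a
    phoneme: str          # 208-b
    delete: bool = False  # 208-c


def apply_infix_rules(grapheme, rules=INFIX_RULES):
    """Step 202 sketch: greedy longest match, falling back to single letters."""
    longest = max(len(key) for key in rules)
    out, i = [], 0
    while i < len(grapheme):
        for n in range(min(longest, len(grapheme) - i), 0, -1):
            chunk = grapheme[i:i + n]
            if chunk in rules:
                out.append(rules[chunk])
                i += n
                break
        else:
            # Only reachable because this toy table is incomplete.
            raise ValueError(f"no infix rule covers {grapheme[i]!r}")
    return "".join(out)


def normalize_stress(phonemes):
    """Step 203 sketch: shift each stress mark right until a vowel follows it."""
    chars = list(phonemes)
    i = 0
    while i < len(chars) - 1:
        if chars[i] == STRESS and chars[i + 1] not in VOWELS:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
        i += 1
    return "".join(chars)


def phase_one(word_list, prefixes, suffixes):
    """Steps 201-206 sketch: flag entries that infix rules alone reproduce."""
    for entry in word_list:
        g = entry.grapheme
        if g.startswith(tuple(prefixes)) or g.endswith(tuple(suffixes)):
            continue  # step 201: affix matched, leave the entry for phase two
        rule_based = normalize_stress(apply_infix_rules(g))  # steps 202-203
        if rule_based == entry.phoneme:                      # steps 204-205
            entry.delete = True                              # step 206


word_list = [
    Entry("aardvark", "'ardvark"),
    Entry("aaron", "<placeholder>"),  # placeholder phonemes, will not match
    Entry("longing", "l'cG|G"),       # ends in "ing", so step 201 skips it
]
phase_one(word_list, prefixes=("un",), suffixes=("ing",))
```

With these toy rules only "aardvark" is flagged for deletion; "aaron" stays because its stored phonemes differ from the rule output, and "longing" is never parsed.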

While this invention has been particularly shown and 
described with references to preferred embodiments thereof, 
it will be understood by those skilled in the art that various
changes in form and details may be made therein without 
departing from the spirit and scope of the invention as 
defined by the appended claims. For example, rearrange- 
ment of certain processing steps in phase one and/or phase 
two may be accomplished while still obtaining the same 
beneficial result of the invention. As an example of such a 
rearrangement, in phase two (FIG. 4), the two processing 
loops could be reversed. Thus, instead of performing the 
processing for each word entry, and then for each affix rule 
on that word entry, processing could be performed for each 
affix rule, and then for each word entry with that affix rule. 
When all word entries were processed for a rule, the next 
affix rule would be selected and processing would repeat 
beginning again with the first word entry. 
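
The loop rearrangement described above can be sketched in Python to show why both orders give the same result: the flags are only ever set, never cleared, and the match test does not depend on the flags. This is a minimal sketch under assumptions; `Entry`, `Affix`, `combine`, `mark_if_match`, and both `phase_two` functions are hypothetical names, and the prefix rule's phoneme portion is a placeholder.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical stand-in for one data structure 208."""
    grapheme: str         # 208-a
    phoneme: str          # 208-b
    delete: bool = False  # 208-c
    save: bool = False    # 208-d


@dataclass(frozen=True)
class Affix:
    """Hypothetical stand-in for one entry of affix table 304."""
    grapheme: str
    phoneme: str
    is_prefix: bool


def combine(root, affix):
    """Step 300 sketch: build the grapheme/phoneme combination pair."""
    if affix.is_prefix:
        return affix.grapheme + root.grapheme, affix.phoneme + root.phoneme
    return root.grapheme + affix.grapheme, root.phoneme + affix.phoneme


def mark_if_match(root, affix, word_list):
    """Steps 301-303 sketch: save the root, delete any matching word entry."""
    pair = combine(root, affix)
    for other in word_list:
        if other is not root and (other.grapheme, other.phoneme) == pair:
            root.save = True
            other.delete = True


def phase_two(word_list, affix_table):
    """Loop order of FIG. 4: for each word, then for each affix."""
    for root in word_list:
        for affix in affix_table:
            mark_if_match(root, affix, word_list)


def phase_two_by_affix(word_list, affix_table):
    """Reversed order: for each affix, then for each word."""
    for affix in affix_table:
        for root in word_list:
            mark_if_match(root, affix, word_list)


def sample_entries():
    return [Entry("long", "l'cG"), Entry("longing", "l'cG|G"),
            Entry("aardvark", "'ardvark")]


AFFIX_TABLE = [Affix("ing", "|G", is_prefix=False),
               Affix("un", "<placeholder>", is_prefix=True)]  # placeholder phonemes

by_word = sample_entries()
phase_two(by_word, AFFIX_TABLE)
by_affix = sample_entries()
phase_two_by_affix(by_affix, AFFIX_TABLE)
```

Both runs mark "long" as saved and "longing" as deleted, leaving "aardvark" untouched.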

Those skilled in the art will recognize or be able to
ascertain using no more than routine experimentation, many 
other equivalents to the specific embodiments of the inven- 
tion described specifically herein. Such equivalents are 
intended to be encompassed in the scope of the claims. For 
example, a logic unit, as contemplated by the present
invention, may be implemented as multiple logic sub-units. 

What is claimed is: 

1. An apparatus for reducing the size of a dictionary used 
in a speech synthesis system having a set of rules for
determining phonemes from graphemes, the dictionary containing
a plurality of entries, the apparatus comprising:
a logic unit determining if a given entry in the dictionary 
can be fully matched by using rules of the rule set, and 
if so, indicating the entry to be deleted from the 
dictionary;
the logic unit determining if the given entry is required in 
the dictionary in order to support other entries, and if 
so, indicating the given entry to be saved and aggregating
the entries indicated as to be saved, to form a
reduced dictionary therefrom; and 
wherein the given entry comprises a grapheme string and 
a corresponding phoneme string. 

2. The apparatus of claim 1 wherein: 

the logic unit generates a rule-based phoneme string for 
the grapheme string of the entry using rules in the rule 
set and determines if the rule-based phoneme string
matches the corresponding phoneme string of the entry, 
and if so, indicates the entry to be deleted from the 
dictionary. 

3. The apparatus of claim 2 wherein: 

the logic unit provides an affix rule set containing affix 
rules for determining phonemes from beginning and 
ending graphemes of character strings, each affix rule 
having a grapheme portion and a corresponding pho- 
neme portion; and 

the logic unit combining grapheme and phoneme strings 
of a root word entry from the dictionary with respective 
grapheme and phoneme portions of an affix rule of the 
affix rule set to form a grapheme combination and 
phoneme combination pair and determining if the 
grapheme combination and phoneme combination pair 
exists as a matching entry in the dictionary, and if so, 
indicating the root word entry to be saved in the 
dictionary, and, indicating the matching entry to be 
deleted from the dictionary. 

4. The apparatus of claim 3, wherein: 

the logic unit determines if the grapheme combination and 
phoneme combination pair exists as a matching entry in 
the dictionary, respectively for the root entry with each 
affix rule in the affix rule set and determines if an entry 
is required, for each root word entry in the dictionary 
starting with a first root word entry. 

5. The apparatus of claim 3 wherein: 

the logic unit, before generating a rule based phoneme 
string, determines if any affix rule from the affix rule set 
matches a portion of the grapheme string of the entry, 
and if so, skipping to a next entry in the dictionary for 
processing. 

6. The apparatus of claim 3 wherein: 

the logic unit further checks if the grapheme string of the 
entry is a homograph, and if so, skips to a next entry in 
the dictionary for processing. 

7. The apparatus of claim 3 wherein: 

the logic unit combines the grapheme string of the root 
word entry with the grapheme portion of the affix rule 
to form the grapheme combination and combining the 
phoneme string of the root word entry with the phoneme
portion of the affix rule to form the phoneme
combination. 

8. The apparatus of claim 7, wherein: 

the logic unit further determines if the grapheme combi- 
nation exists as a matching grapheme string in an entry 
in the dictionary, and if so, obtaining the corresponding
phoneme string as a matching phoneme string for the 
entry and determining if the phoneme combination 
matches the matching phoneme string, and if so, indi- 
cating the root word entry to be saved in the dictionary 
and indicating the matching entry to be deleted from 
the dictionary. 

9. The apparatus of claim 8 wherein: 

the logic unit normalizes any lexical stress in the phoneme 
combination and the matching phoneme string before 
determining if the phoneme combination matches the
matching phoneme string. 

10. An apparatus for reducing the size of a dictionary used 
in a speech synthesis system, the dictionary containing a 
plurality of entries, the apparatus comprising: 

a logic unit determining if a given entry is required in the 
dictionary in order to produce the phoneme string of 
another entry, and if so, indicating the given entry to be 
saved; 

the logic unit creating a dictionary containing entries 
indicated to be saved; 

the logic unit combining grapheme and phoneme strings 
of a root word entry in the dictionary with respective 
grapheme and phoneme portions of an affix rule of the 
affix rule set to form a grapheme combination and 
phoneme combination pair; and determining if the 
grapheme combination and phoneme combination pair 
exists as a matching entry in the dictionary, and if so, 
indicating the root word entry to be saved in the 
dictionary and indicating the matching entry to be 
deleted; and 

wherein the speech synthesis system includes an affix rule 
set containing affix rules for determining phonemes 
from beginning and ending graphemes of character 
strings, each affix rule having a grapheme portion and 
a corresponding phoneme portion. 

11. The apparatus of claim 10 wherein: 

the logic unit determines if the grapheme combination and 
phoneme combination pair exists as a matching entry in 
the dictionary, respectively, for the root word entry with 
each affix rule in the affix rule set. 

12. The apparatus of claim 11 wherein the logic unit 
determines if an entry is required, for each root entry in the 
dictionary, starting with a first root word entry. 

13. The apparatus of claim 12 wherein: 

the logic unit further combines the grapheme string of the 
root word entry with the grapheme portion of the affix 
rule to form the grapheme combination and combines 
the phoneme string of the root word entry with the 
phoneme portion of the affix rule to form the phoneme 
combination. 

14. The apparatus of claim 13 wherein: 

the logic unit determines if the grapheme combination 
exists as a matching grapheme string in an entry in the 
dictionary, and if so, obtaining the corresponding pho- 
neme string as a matching phoneme string for the entry 
and determining if the phoneme combination matches 
the matching phoneme string, and if so, indicating the 
root word entry to be saved in the dictionary and 
indicating the matching entry to be deleted in the 
dictionary. 

15. The apparatus of claim 14 wherein: 

the logic unit normalizes any lexical stress in the phoneme 
combination and the matching phoneme string before 
determining if the phoneme combination matches the 
matching phoneme string. 

16. The apparatus of claim 12 wherein: 

the logic unit saves, in a reduced dictionary, the entries 
that have been indicated to be saved. 

17. The apparatus of claim 12 wherein: 

the logic unit deletes entries that have been indicated to be 
deleted from the dictionary. 

18. The apparatus of claim 12 wherein the entries in the 
dictionary are arranged according to length of grapheme 
string with the shortest grapheme string first.

19. The apparatus of claim 12 wherein:

the logic unit determines if the grapheme combination and 
phoneme combination pair exists as a matching entry in 
the dictionary, first with rules from the affix rule set for 
determining phonemes from beginning graphemes.

20. The apparatus of claim 12 wherein: 

the logic unit determines if the grapheme combination 
exists as a matching grapheme string in an entry in the 
dictionary, and if so, obtains the corresponding phoneme
string as a matching phoneme string for the entry;
and

the logic unit determines if the phoneme combination 
matches the matching phoneme string, and if so, indicating
the root word entry to be saved in the dictionary
and indicating the matching entry to be deleted in the 
dictionary. 

21. The apparatus of claim 20 wherein: 

the logic unit normalizes any lexical stress in the phoneme 
combination and the matching phoneme string before
determining if the phoneme combination matches the 
matching phoneme string. 

22. The apparatus of claim 20 wherein: 

the logic unit saves, in a reduced dictionary, entries that 
have been indicated to be saved.

23. An apparatus for reducing the size of a dictionary used 
in a speech synthesis system having a set of rules for 
determining phonemes from graphemes, the dictionary con- 
taining a plurality of entries, the apparatus comprising: 

a logic unit determining, for each entry in the dictionary, 
if the entry in the dictionary can be fully matched by 
using rules of the rule set, and if so, indicating the entry 
to be deleted from the dictionary; 

the logic unit creating a reduced dictionary from the
entries remaining after omitting any entries indicated as 
to be deleted; and 
wherein each entry comprises a grapheme string and a 
corresponding phoneme string. 

24. The apparatus of claim 23 wherein:

the logic unit further generates a rule-based phoneme 
string for the grapheme string of the entry, using rules 
in the rule set, and determines if the rule-based phoneme
string matches the corresponding phoneme string
of the entry, and if so, indicating the entry is to be
deleted from the dictionary.

25. The apparatus of claim 24 wherein: 

the logic unit further determines, for each entry in the 
dictionary starting with a first entry, if an entry in the 
dictionary can be fully matched.

26. The apparatus of claim 25 wherein: 

the logic unit provides an affix rule set for the speech 
synthesis system, the affix rule set for determining
phonemes from beginning and ending graphemes of
character strings, and before generating a rule based
phoneme string, checking if any affix rule from the affix
rule set matches a portion of the grapheme string of the
entry, and if so, skipping to a next entry in the dictionary
for processing.

27. The apparatus of claim 26 wherein: 

the logic unit checks if the grapheme string of the entry is 
a homograph, and if so, skips to a next entry in the 
dictionary for processing. 

28. The apparatus of claim 26 wherein:

the logic unit deletes entries that have been marked as to 
be deleted from the dictionary. 

29. The apparatus of claim 26 wherein:

the logic unit saves, in a reduced dictionary, entries that 
have not been indicated to be saved.

30. A speech synthesis system for reducing the size of a 
dictionary containing a plurality of entries, the speech syn- 
thesis system having a set of rules for determining phonemes 
from graphemes, the speech synthesis system comprising: 

a dictionary; 

a text-to-speech synthesizer connected to the dictionary; 

a rule set connected to the text-to-speech synthesizer; 

a logic unit, contained in the text-to-speech synthesizer, 
determining if a given entry in the dictionary can be 
fully matched by using rules of the rule set, and if so, 
indicating the entry to be deleted from the dictionary; 

the logic unit determining if the given entry is required in 
the dictionary in order to support other entries, and if 
so, indicating the given entry to be saved and aggre- 
gating the entries indicated as to be saved, to form a 
reduced dictionary therefrom; and 

wherein the given entry comprises a grapheme string and 
a corresponding phoneme string. 

31. A speech synthesis system for reducing the size of a 
dictionary containing a plurality of entries, the speech syn- 
thesis system comprising: 

a dictionary; 

a text-to-speech synthesizer connected to the dictionary; 

a rule set connected to the text-to-speech synthesizer; 

a logic unit, contained in the text-to-speech synthesizer, 
determining if a given entry is required in the dictio- 
nary in order to produce the phoneme string of another 
entry, and if so, indicating the given entry to be saved; 

the logic unit creating a dictionary containing entries 
indicated to be saved; 

the logic unit combining grapheme and phoneme strings 
of a root word entry in the dictionary with respective 
grapheme and phoneme portions of an affix rule of the 
affix rule set to form a grapheme combination and 
phoneme combination pair; and determining if the 
grapheme combination and phoneme combination pair 
exists as a matching entry in the dictionary, and if so, 
indicating the root word entry to be saved in the 
dictionary and indicating the matching entry to be 
deleted; and 

wherein the speech synthesis system includes an affix rule 
set containing affix rules for determining phonemes 
from beginning and ending graphemes of character 
strings, each affix rule having a grapheme portion and 
a corresponding phoneme portion. 

32. A speech synthesis system for reducing the size of a 
dictionary containing a plurality of entries, the speech syn- 
thesis system having a set of rules for determining phonemes 
from graphemes, the speech synthesis system comprising: 

a dictionary; 

a text-to-speech synthesizer connected to the dictionary; 

a rule set connected to the text-to-speech synthesizer; 

a logic unit, contained in the text-to-speech synthesizer, 
determining for each entry in the dictionary, if the entry 
in the dictionary can be fully matched by using rules of 
the rule set, and if so, indicating the entry to be deleted 
from the dictionary; 

the logic unit creating a reduced dictionary from the 
entries remaining after omitting any entries indicated as 
to be deleted; and 
wherein each entry comprises a grapheme string and a
corresponding phoneme string. 

33. A computer program product comprising: 

a computer usable medium for reducing the size of a 
dictionary containing a plurality of entries, the speech 
synthesis system having a set of rules for determining 
phonemes from graphemes; 
a set of computer program instructions embodied on the 
computer usable medium, including instructions to: 
determine if a given entry in the dictionary can be fully
matched by using rules of the rule set, and if so,
indicating the entry is to be deleted from the dictionary;

determine if the given entry is required in the dictionary
in order to support other entries, and if so, indicating
the given entry to be saved;

aggregate the entries indicated as to be saved, to form
a reduced dictionary therefrom; and

wherein the given entry comprises a grapheme string
and a corresponding phoneme string.

34. A computer program product comprising: 

a computer usable medium for reducing the size of a 
dictionary used in a speech synthesis system, the dic- 
tionary containing a plurality of entries; 
a set of computer program instructions embodied on the 
computer usable medium, including instructions to: 
determine if a given entry is required in the dictionary 
in order to produce the phoneme string of another 
entry, and if so, indicating the given entry to be 
saved; 

create a dictionary containing entries indicated to be 
saved; 

combine grapheme and phoneme strings of a root word 
entry in the dictionary with respective grapheme and 
phoneme portions of an affix rule of the affix rule set 
to form a grapheme combination and phoneme com- 
bination pair; 

determine if the grapheme combination and phoneme 
combination pair exists as a matching entry in the 
dictionary, and if so, indicating the root word entry 
to be saved in the dictionary and indicating the 
matching entry to be deleted; and 

wherein the speech synthesis system includes an affix 
rule set containing affix rules for determining pho-
nemes from beginning and ending graphemes of 
character strings, each affix rule having a grapheme 
portion and a corresponding phoneme portion. 

35. A computer program product comprising: 

a computer usable medium for reducing the size of a 
dictionary used in a speech synthesis system having a 
set of rules for determining phonemes from graphemes, 
the dictionary containing a plurality of entries; 
a set of computer program instructions embodied on the 
computer usable medium, including instructions to: 
determine, for each entry in the dictionary, if the entry
in the dictionary can be fully matched by using rules
of the rule set, and if so, indicating the entry to be
deleted from the dictionary;

create a reduced dictionary from the entries remaining
after omitting any entries indicated as to be deleted;
and

wherein each entry comprises a grapheme string and a
corresponding phoneme string.



